Robust signature for signal authentication

ABSTRACT

A method, system and computer readable medium for authenticating an audio-visual signal, such as a digital image or video, comprising the generation of a robust image signature of variable size. In a preferred embodiment, DC-values of blocks of a digital image are calculated and areas with similar DC-values are merged into regions. The signature is based on said regions and is of variable length, depending on the desired localisation ability or an allowable signature length. The resulting signature bits are robust to compression and other allowable image operations. The hierarchical solution provides both robustness and tampering localisation.

FIELD OF THE INVENTION

This invention relates in general to the field of signal authentication and more particularly to the authentication of digital images and video.

BACKGROUND OF THE INVENTION

The success of digital imaging and video has led to a wide use of this technology in many fields of everyday life. Technology to edit, alter or modify digital images or video sequences is commercially available and allows modifications of the contents of said images or videos without leaving traces. For a variety of applications, such as evidential imaging in law enforcement, medical documentation, damage assessment for insurance purposes, etc., it is necessary to ensure that an image or video has not been modified and is congruent with the image or video originally taken. This led to the development of image or video authentication systems for which an example is shown in FIG. 1, wherein a signature or a watermark is created at 1.20 for a digital signal, i.e. an image or video, which is acquired in 1.10. The signature is embedded at 1.30 in the digital image or video. Thereafter the image or video is processed or tampered in 1.40, played, recorded or extracted in 1.50 and finally verified in 1.60 in order to either ensure that the authenticity of the digital image or video is proven or that modifications of the digital image or video are revealed.

In certain situations some changes to images are desired and allowable and should not be classified as malicious tampering when validating the authenticity of the images/video. Such changes occur e.g. when applying lossy compression to the digital image in order to reduce storage capacity or increase transmission rate. Lossy compression causes image modifications, but not to an extent that degrades the intended use of the images. An example for such a compression technique is the JPEG image file format, which reduces the size of a digital image considerably, i.e. the bit and byte sequence of the image is modified, while the perceptual information of the image is maintained.

Therefore a need exists for image authentication which distinguishes between allowable image modifications, such as lossy compression, and malicious tampering, such as the replacement of an image area with new content or with content copied from an earlier or later point in time of the same scenery.

One approach to authenticating images is to use classical cryptography, whereby a digital image is converted to a hash using a cryptographic key. The generated hash is taken as a “fingerprint” of the digital image. A digital image whose authenticity is to be validated is converted to a hash using the same cryptographic key. If the new hash is exactly the same as the originally generated hash, the authenticity of the image is validated. By its nature classical cryptography is bit sensitive and a change of one bit in the original digital signal results in a completely different hash. Thus, when one bit of the image to be validated is changed during e.g. transmission or storage by e.g. compression, the image to be validated is classified as being tampered. Thus classical cryptography is not suited for authentication of a digital image having the above requirements concerning allowable modifications of the image.
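The bit sensitivity described above can be illustrated with a short sketch. It assumes HMAC-SHA256 as a stand-in for the keyed hash; the text does not name a particular algorithm, and the function names `keyed_hash` and `flip_lowest_bit` are hypothetical.

```python
import hashlib
import hmac


def keyed_hash(key: bytes, data: bytes) -> str:
    """Keyed hash of the raw image bytes (HMAC-SHA256 is an assumption)."""
    return hmac.new(key, data, hashlib.sha256).hexdigest()


def flip_lowest_bit(data: bytes, index: int) -> bytes:
    """Return a copy of data with the lowest bit of one byte inverted."""
    changed = bytearray(data)
    changed[index] ^= 1
    return bytes(changed)


if __name__ == "__main__":
    key = b"example-key"                 # hypothetical key
    pixels = bytes(range(64))            # stand-in for image data
    original = keyed_hash(key, pixels)
    altered = keyed_hash(key, flip_lowest_bit(pixels, 0))
    # A single changed bit yields an unrelated digest, so even a
    # harmless recompression would be reported as tampering.
    print(original)
    print(altered)
```

Because the comparison is exact, this kind of hash cannot separate allowable modifications from malicious ones, which is the limitation the paragraph above points out.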

An alternative is the embedding of semi-fragile watermarks or the creation of robust digital signatures. Both concepts maintain the perceptual information of the image and are based on generating additional information from the digital image and hiding the information in the image itself or its framework, or by transmitting or storing the additional information separately as “meta-data” with the image.

Semi-fragile embedded watermarks for authentication purposes provide tolerance against allowable operations such as compression at modest compression rates. However, when the digital signal has been tampered, watermark detection fails in areas of the original signal which have been tampered. The embedding of semi-fragile watermarks typically fails to provide the ability to distinguish between innocuous and malicious signal modifications. Furthermore it is fragile because the watermarks typically cannot survive high compression ratios. Also, in certain cases such as flat regions, a watermark cannot be embedded. Finally, it is not possible to identify tampering of the digital signal when flat content is inserted in the image during tampering.

A robust signature is a set of bits which summarises the content of the image and which is relatively unchanged by compression or other allowable operations, but altered considerably by tampering. Many image properties can be used for computing a signature, e.g. edges, moments, DC-values, histograms, compression invariants, and projections onto smoothed noise patterns. All methods of generating signatures have in common that the size of the signature increases rapidly with the level of protection, i.e. the ability to accurately localise tampering. This poses a problem due to storing and transmitting requirements because when embedding the signature into the digital image, the size of a signature is critical, especially when embedding the signature as a robust watermark. A robust watermark is defined as a watermark which allows correct extraction of the payload bits even after operations that significantly degrade or damage the image, such as heavy compression or tampering by e.g. replacement of some pixels. In contrast to semi-fragile watermarks, a robust watermark allows the payload to be extracted correctly, even after tampering. However, the payload of robust watermark schemes is very limited, typically just tens of bits. Thus, the problem to be solved by the invention is defined as how to provide a robust tamper detection for an audio-visual signal such as a digital image or a video, allowing localisation of tampering in the signal, but adding little payload to the signal.

SUMMARY OF THE INVENTION

The present invention overcomes the above-identified deficiencies in the art and solves the above problem by providing a method and system of authenticating an audio-visual signal comprising the formation of a progressive and robust signature which has a small and adaptable signature size, according to the appended independent claims.

According to embodiments of the invention, a method, an apparatus, and a computer-readable medium for authenticating an audio-visual signal are disclosed. A progressive signature is formed for the audio-visual signal whereby the audio-visual signal is split into blocks. The DC-values of the blocks are then calculated. The blocks are assigned to regions with similar DC-values and DC-differences between the regions are calculated thereafter. Finally signature bits are generated based on the DC-differences calculated.

These and other aspects of the invention will be apparent from and elucidated with reference to the embodiments described hereafter.

BRIEF DESCRIPTION OF THE DRAWINGS

Preferred embodiments of the present invention will be described in the following detailed disclosure, reference being made to the accompanying drawings, in which

FIG. 1 shows a Prior Art Image Authentication System;

FIG. 2 is a flowchart illustrating the method according to an embodiment of the invention;

FIG. 3 is a flowchart illustrating how a digital image is split into blocks;

FIG. 4 is a flowchart illustrating the assignment of blocks into regions;

FIG. 5 is a flowchart illustrating the regional difference calculation;

FIG. 6 is a flowchart illustrating the signature bit generation;

FIG. 7 is a flowchart illustrating an alternative block generation in the second run;

FIG. 8 illustrates an apparatus according to another embodiment of the invention; and

FIG. 9 illustrates a computer readable medium according to still another embodiment of the invention.

DESCRIPTION OF PREFERRED EMBODIMENTS

Signature Generation

FIG. 2 shows a flowchart of a preferred embodiment according to the invention. A process 2 of generating a signature for a digital image is initiated in step 2.10. A counter variable i is set to zero. In step 2.20, a loop sequence starts and the variable M, defining the number of blocks to be generated in step 2.30, is set to a value M(i) which is allocated to the current loop and the variables N1 and N2, defining the size of the M blocks to be generated in step 2.30, are set to values N1(i) and N2(i) which are allocated to the current loop. The value of M is at least 1 and maximally equivalent to the maximal number of pixels in any direction of the digital image. The values of N1 and N2 defining the block size are based on the number of blocks and the size of the digital image, thus at least 1 and maximally equivalent to the maximal number of pixels in any direction of the digital image. In the first loop, the values for M, N1 and N2 are chosen such that relatively large blocks are generated, e.g. blocks of the size 128×128 pixels.

In step 2.30 the image is subdivided into M(i) blocks of size N1(i)×N2(i). Thereafter, the DC-value of each block is calculated in step 2.40. The DC-value is defined as the mean luminance.
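Steps 2.30 and 2.40 can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes the image is a list of rows of luminance values and that partial blocks at the right and bottom edges are ignored, which the text does not specify. The function name `block_dc_values` is hypothetical.

```python
def block_dc_values(image, n1, n2):
    """Return a grid of DC-values (mean luminance), one per n1 x n2 block.

    image: list of rows, each a list of luminance values.
    Assumption: incomplete blocks at the image border are skipped.
    """
    rows, cols = len(image), len(image[0])
    grid = []
    for top in range(0, rows - n1 + 1, n1):
        grid_row = []
        for left in range(0, cols - n2 + 1, n2):
            total = sum(
                image[r][c]
                for r in range(top, top + n1)
                for c in range(left, left + n2)
            )
            grid_row.append(total / (n1 * n2))
        grid.append(grid_row)
    return grid


if __name__ == "__main__":
    # 4 x 4 image: dark left half, bright right half.
    image = [[10, 10, 200, 200] for _ in range(4)]
    print(block_dc_values(image, 2, 2))
```

With 2×2 blocks the example image yields a 2×2 grid of DC-values, one per block.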

The M(i) blocks are assigned to regions with similar DC-values in step 2.50, which is described in more detail with reference to FIG. 4.

The differences between mean DC-values of the above assigned regions are calculated in step 2.60, as described in more detail with reference to FIG. 5.

Signature bits are generated in step 2.70, preferably by thresholding the above differences. This step is described in more detail below with reference to FIG. 6.

In step 2.80 it is decided whether the loop is continued by going back to step 2.20 or whether the signature generation is terminated in step 2.90 because the desired number of signature bits has been generated.

FIG. 7 shows an alternative to step 2.30 in FIG. 2. Instead of splitting up the entire image from scratch, the regions formed at the previous level can be split up. This can be done starting from the second run of the loop 2.20 to 2.80 when regions already have been generated in the previous run.

An exemplary loop illustrating the DC value calculation for all blocks is shown in FIG. 3. A loop variable j is initialised in 3.10 and the DC-value of block B(j) is calculated in 3.20. In step 3.30 the loop variable is increased and, if the DC-values for all blocks have not been calculated, step 3.20 is repeated for the next block; otherwise the calculation of block DC-values is terminated.

FIG. 4 illustrates how the blocks are assigned to regions. Firstly, a block is picked according to a pseudo random sequence in 4.10. This block B(Rnd) becomes the first block in a region in step 4.20. The initial DC value of the region is thus the same as the DC-value of the first block. Next, the neighbouring blocks to the first block are examined in 4.30 to 4.60. In step 4.40 it is examined whether the absolute difference between the DC value of the block currently examined and the DC value of the current region is less than a threshold T_1. If the difference is lower than T_1, the block is assigned to the current region in step 4.50 and the DC-value of the region is updated. If not all neighbouring blocks have been examined, the loop goes back to step 4.30 in step 4.60 and continues with the next neighbouring block. In 4.70 it is checked whether each of the available blocks is assigned to a region and the loop goes back to step 4.10 if a block has not yet been assigned to a region.
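The region assignment of FIG. 4 might be sketched as below. Several details are assumptions not fixed by the text: neighbours are taken to be the four edge-adjacent blocks, neighbours already belonging to a region are skipped, the region DC-value is updated as a running mean, and the pseudo random sequence is a seeded shuffle of block positions. The name `assign_regions` is hypothetical.

```python
import random


def assign_regions(dc, t1, seed=0):
    """Assign each block to a region; return (labels, region_dc).

    dc: grid (list of rows) of block DC-values.
    labels maps (row, col) to a region index; region_dc lists the
    DC-value of each region in the order the regions were formed.
    """
    rows, cols = len(dc), len(dc[0])
    order = [(r, c) for r in range(rows) for c in range(cols)]
    random.Random(seed).shuffle(order)      # step 4.10: pseudo random pick
    labels, region_dc = {}, []
    for r, c in order:
        if (r, c) in labels:
            continue                        # block already assigned
        region = len(region_dc)
        labels[(r, c)] = region             # step 4.20: first block of region
        mean, count = dc[r][c], 1
        # Steps 4.30 to 4.60: examine the neighbours of the first block.
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in labels:
                if abs(dc[nr][nc] - mean) < t1:          # step 4.40
                    labels[(nr, nc)] = region            # step 4.50
                    mean = (mean * count + dc[nr][nc]) / (count + 1)
                    count += 1
        region_dc.append(mean)
    return labels, region_dc                # step 4.70: all blocks assigned


if __name__ == "__main__":
    dc = [[10, 12, 200], [11, 13, 205]]
    labels, region_dc = assign_regions(dc, t1=8, seed=1)
    print(labels)
    print(region_dc)
```

The seed plays the role of a secret here: a different seed gives a different region order and hence a different signature for the same image.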

The resulting regions' DC-values are arranged according to FIG. 5 in the order in which the regions were formed according to FIG. 4 and the difference of the DC-values of the regions is calculated, i.e. [DCr2−DCr1, DCr3−DCr2, DCr4−DCr3, . . . ] in step 5.20. Step 5.10 initiates a counter variable s and step 5.30 checks if all differences have been calculated. If this is not the case, the loop branches back to step 5.20 and calculates the next DC-difference-value until the last value is calculated.
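The difference sequence [DCr2−DCr1, DCr3−DCr2, . . . ] of FIG. 5 reduces to a pairwise subtraction over the region DC-values in formation order. A minimal sketch, with the hypothetical name `dc_differences`:

```python
def dc_differences(region_dc):
    """Differences between consecutive region DC-values, in formation order."""
    return [later - earlier for earlier, later in zip(region_dc, region_dc[1:])]


if __name__ == "__main__":
    print(dc_differences([10, 40, 35]))   # DCr2-DCr1, DCr3-DCr2
```

R regions give R−1 differences, so an image that collapses into a single region contributes no differences at that level.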

Signature bits are generated according to FIG. 6, whereby a counter variable s is initialised in step 6.10 and it is checked in step 6.50 if all differences calculated above (FIG. 5) have been converted to signature bits. In step 6.20 it is checked if the current examined difference is larger than a threshold T_2. If this is positive, the current signature bit is assigned a ‘0’ in 6.30, otherwise the current signature bit is set to ‘1’ in 6.40.
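The thresholding of FIG. 6 can be sketched in a few lines. The bit polarity follows the text (a difference larger than T_2 gives ‘0’, otherwise ‘1’); the function name `signature_bits` is hypothetical.

```python
def signature_bits(differences, t2):
    """Map each DC-difference to one signature bit (steps 6.20 to 6.40)."""
    return [0 if diff > t2 else 1 for diff in differences]


if __name__ == "__main__":
    print(signature_bits([30, -5, 2], t2=0))
```

A difference exactly equal to T_2 is not "larger than" the threshold and therefore maps to ‘1’.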

The above-described preferred embodiment according to the invention provides a hierarchical approach. Firstly, a few signature bits are calculated at a coarse level using large block sizes. Then further signature bits are calculated at finer levels using progressively smaller block sizes. The earliest bits in the signature are the most robust, but can provide only poor localisation of any image alterations due to the large block size used. Thus the signature incorporates information concerning the whole image first at a coarse level, and then at progressively more detailed levels, continuing as far as the permitted signature size allows.
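The coarse-to-fine loop of FIG. 2 can be sketched independently of how the bits of one level are produced. In this illustration the per-level computation is passed in as a callable `level_bits(image, block_size)`, a hypothetical helper standing for steps 2.30 to 2.70; the names `progressive_signature`, `block_sizes` and `max_bits` are likewise assumptions.

```python
def progressive_signature(image, block_sizes, max_bits, level_bits):
    """Concatenate per-level bits, coarse to fine, up to max_bits.

    block_sizes: block edge lengths, largest first (e.g. 128, 64, 32).
    level_bits:  callable returning the list of bits for one level.
    """
    signature = []
    for size in block_sizes:
        if len(signature) >= max_bits:
            break                      # step 2.80: enough bits generated
        signature.extend(level_bits(image, size))
    return signature[:max_bits]        # truncate to the allowed length


if __name__ == "__main__":
    # Stub level: pretends every level contributes the bits 0, 1.
    stub = lambda image, size: [0, 1]
    print(progressive_signature(None, [128, 64, 32], 5, stub))
```

Because the coarse levels come first, truncating the signature discards only the finest, least robust bits.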

The localisation ability of the above preferred embodiment according to the invention is automatically adapted to the image content. In content with sparse details and many flat regions, many blocks will be merged. This results in a few regions of large size and therefore only a few signature bits. This allows progression to smaller block sizes and thus better localisation is achieved with the same signature length.

It is understood that the signature bits are determined by the most important details of the digital image, because the regions that are formed are used to generate the signature bits. Boundaries between the regions are in areas where edges and details occur in the digital image.

In order to optimise performance, the block sizes at each level and also the values of the threshold used at each level to decide whether or not to add a block to a region, are adjustable.

By using differences between DC-values of different image blocks, the above described signature generation is robust, as DC differences are a robust image property that is little affected by compression and other acceptable non-malicious image alterations. Therefore DC-values are preferred in the above-described embodiment according to the invention. However, other image properties can also be used for computing a variable signature according to the invention, such as edges, moments, histograms, compression invariants, and projections onto smoothed noise patterns.

In order to successfully forge an apparently authentic image, any altered content would have to be replaced by content giving a similar DC value in each block considered during the signature formation. Successful forgery is very difficult because the block boundaries are unknown to the forger.

Security is also provided via the pseudo random sequence that determines the choice of blocks with which to begin the formation of regions. Using a pseudo random sequence to determine the order in which the comparisons with neighbouring blocks' DC values are made provides additional security.

Varying the position of the grid defining the block boundaries provides further security against forgery. In this case, a means is provided in the authentication device to determine the grid boundary prior to checking the signature. One such means is to indicate the grid location via a separate watermark detectable by the authentication device. This has the additional advantage of providing increased robustness in the presence of jitter and cropping.

In some applications, such as security imaging, only one of a plurality of frames, e.g. one frame in every 50 frames, is stored. It is therefore important that each frame is capable of authenticating itself without reference to preceding or subsequent frames. The above method meets this requirement as it treats each video frame as a separate still image. This also means that the method is equally applicable to both still images and video.

In the above method signature formation continues until the required number of bits has been generated. The allowable signature length is determined by the number of bits of meta-data that may be stored/transmitted, or by the payload capacity of the watermark that will be used to embed the signature into the image. The longer the signature is, the more precise is the localisation of tampered image regions.

FIG. 8 illustrates an embodiment of the invention in an apparatus. A system 10 for authenticating an audio-visual signal is shown. An audio-visual signal is generated in 20. Preferably the audio-visual signal is captured in 20 by an image capturing device, such as a surveillance camera or a CCD array, and/or an appropriate means for capturing the audio signal, such as a microphone. However, the audio-visual signal may also originate from a transmission signal, such as a video signal, or from a storage device, such as a hard disk drive or similar computer readable medium. A means 30 splits the audio-visual signal into blocks. The DC-value for each block is calculated by a means 40. Blocks with similar DC values are assigned by means 50 to regions. Thereafter DC-differences between the regions, i.e. differences of the DC-values of the above regions, are calculated by means 60 and then signature bits are generated by means 70 based on the DC-differences calculated by means 60. In 80 the signature generated is further processed, i.e. preferably embedded in the audio-visual signal, preferably as a robust watermark. Means 30, 40, 50, 60 and 70 are preferably implemented in the system 10 as a module, preferably comprising a microprocessor or similar electronic device such as a programmable array or similar electronic circuit.

FIG. 9 illustrates another embodiment of the invention comprising a computer readable medium 100 for authenticating an audio-visual signal. An audio-visual signal is generated in 120. Preferably the audio-visual signal is captured in 120 by an image capturing device, such as a surveillance camera or a CCD array, and/or an appropriate means for capturing the audio signal, such as a microphone. However, the audio-visual signal may also originate from a transmission signal, such as a video signal, or from a storage device, such as a hard disk drive or similar computer readable medium. A first program module 130 directs a computer 110 to split the audio-visual signal into blocks. The computer 110 comprises a processor 111 executing the computing instructions from the program modules described herein. The DC-value for each block is calculated by a second program module 140 directing the computer 110. Blocks with similar DC values are assigned to regions by a third program module 150 directing the computer 110. Thereafter DC-differences between the regions, i.e. differences of the DC-values of the above regions, are calculated by a fourth program module 160 directing computer 110 and then signature bits are generated by a fifth program module 170 directing computer 110 based on the DC-differences calculated by program module 160. In 180 the signature generated is further processed, i.e. preferably embedded in the audio-visual signal, preferably as a robust watermark.

The method, apparatus and program instructions described above include inherent flexibility in that more signature bits, giving increased localisation ability, can easily be generated when advances in storage, transmission or watermark techniques, e.g. higher available transmission rates or larger memory chips at lower price, allow a larger signature.

By using differences between DC-values of a digital image, the resulting signature bits are robust to compression and other allowable image operations.

The merging of areas with similar DC-values results in a signature with reduced size and further increases the robustness of the signature because small DC differences give unreliable signature bits.

The hierarchical approach provides both robustness and tampering localisation.

The signature is automatically adapted to the image content because, for images which contain fewer details, the signature automatically provides increased localisation.

Because the calculation of the signature does not demand high computational power, the signature can be embedded into video.

Flexibility is provided because the signature can be truncated at whatever length suits the current application. Improvements in other parts of the system which allow a larger signature to be generated result directly in improved tamper localisation.

Hence, the method according to the invention has the advantages of robustness, tamper localisation, short signature length, and flexibility.

In order to judge the authenticity of an image, a similar procedure to the signature formation is used. However, at each comparison of DC values, a “soft-decision” is taken, i.e. the value of the DC differences is used to judge the probability of whether a block was merged into a region or not. In this manner a trellis is constructed that charts the probability of different signatures being formed from the received image. In one embodiment, a Viterbi type algorithm is used to determine the most probable manner of forming a signature matching that generated from the original image. The signature given by the watermark provides the most likely region formation, but special care is taken to control error probabilities.

An authentic image shows only minor changes in its DC values due to compression or other allowable processing. There is therefore a high probability of the received image being able to generate a matching signature. If, however, the image has been maliciously altered, at least one of the calculated DC differences is significantly altered, and the overall probability of the suspect image generating a matching signature is low, indicating that larger image modifications have occurred. In addition, at one or more points during the generation of a matching signature, some of the decisions that have to be made will have a very low associated probability. For example, for the image to be authentic, compression would have to have caused a large DC difference to become very small, which is very unlikely. This fact can be used to localise which regions of the image have been tampered.

Image authentication is an asymmetric problem in the sense that very many images need to be equipped with authentication capability, but only relatively few of these will actually have their authenticity checked. In the security camera scenario, for example, all frames generated by the camera have their signature calculated, but perhaps only one frame in every fifty will actually be recorded. Moreover, only a very small number of the recorded frames need to be authenticated, e.g. in the event that these frames are used as evidence in a court case. The result is that the signature calculation performed for every frame must have lower computational requirements than the rarely performed authenticity verification.

In a preferred embodiment of the invention, the signature calculation is therefore positioned close to the image capture device in order to prevent the possibility of tampering before the signature is calculated. In this case the signature calculation and, if appropriate, embedding of it as a watermark, has to take place in real-time on the video stream inside a camera. This places severe constraints upon the complexity of the signature calculation and watermark embedding algorithms.

The signature generation method described above is based upon calculating DC values, which is not a computationally demanding or memory hungry task. Signature generation and embedding according to the above method can therefore be done in real-time.

In case tampering is detected, an analysis of the modification is undertaken. Applications and use of the above described signal authentication according to the invention are various and include exemplary fields such as

-   security cameras or surveillance cameras, such as for law enforcement, evidential imaging or fingerprints,
-   health care systems such as telemedicine systems, medical scanners, and patient documentation,
-   insurance documentation applications such as car insurance, property insurance and health insurance.

The present invention has been described above with reference to specific embodiments. However, other embodiments than the preferred above are equally possible within the scope of the appended claims.

Furthermore, the term “comprising” does not exclude other elements or steps, the terms “a” and “an” do not exclude a plurality and a single processor or other unit may fulfil the functions of several of the units or circuits recited in the claims.

1. A method of authenticating an audio-visual signal comprising formation of a progressive signature by generating a variable number of signature bits.
 2. A method according to claim 1 comprising the steps of splitting said audio-visual signal into blocks and progressively decreasing the size of said blocks.
 3. A method according to claim 2 further comprising the steps of generating said signature from the contents of said blocks, whereby said number of signature bits progressively increases with decreasing block size.
 4. A method according to claim 1 further characterised in that said number of signature bits increases with the complexity of said audio-visual signal.
 5. A method according to claim 4 further characterised by the steps of splitting said audio-visual signal into blocks, merging similar blocks into regions, and generating said signature based on said regions.
 6. A method according to claim 5, the steps of merging similar blocks into regions and generating said signature based on said regions comprising the steps of calculating an image characteristics value for each of said blocks, assigning blocks with similar image characteristics values to regions, calculating differences between image characteristics values of said regions, and generating said number of signature bits based on said differences between said image characteristics values of said regions.
 7. A method according to claim 6, said image characteristics values being DC-values.
 8. A method according to claim 6 further characterised in that said steps for the formation of said progressive signature are at least once looped.
 9. A method according to claim 8 further characterised in that the size of said blocks is decreased in each loop.
 10. A method according to claim 1 further characterised in that the length of said signature with a variable number of signature bits is limited to a maximum signature length.
 11. A method according to claim 10 further comprising the step of embedding said signature in said audio-visual signal as a watermark, said maximum signature length being defined as the maximum payload of the watermark.
 12. A method according to claim 1 further comprising the steps of implanting said signature in said audio-visual signal and/or storing or transmitting said audio-visual signal.
 13. A method according to claim 12 whereby the signature is implanted in the audio-visual signal as a watermark.
 14. A method according to claim 1 further comprising the step of verifying the authenticity of said audio-visual signal by verifying said signature.
 15. A method according to claim 7 whereby the step of assigning said blocks to regions with similar DC values comprises repeating the steps of: picking a first block not yet assigned to a region according to a pseudo-random sequence wherein said first block becomes the first block of a new region and the DC-value of said first block becomes the DC-value of said new region, and examining each neighbouring block of said first block whereby a further block of said neighbouring blocks is assigned to said new region and the DC-value of the new region is updated with the DC-value of the further block if the DC-value of said further block is less than a threshold, until all blocks are assigned to a region.
 16. A method according to claim 7 whereby the step of calculating DC-differences between said regions comprises the steps of arranging the DC-values of said regions in the order in which the regions are formed and calculating said DC-differences between consecutive regions for all regions.
 17. A method according to claim 6 whereby the step of splitting said audio-visual signal into blocks is characterised by said blocks being formed in a previously formed region.
 18. A method according to claim 7 whereby said step of generating signature bits based on said DC-differences is characterised by thresholding said DC-differences.
 19. A method according to claim 1, wherein said audio-visual signal is a digital image or frame of a digital video.
 20. A system for authenticating an audio-visual signal comprising a device for formation of a progressive signature generating a variable number of signature bits.
 21. A system for authenticating an audio-visual signal according to claim 20, said device for formation of a progressive signature comprising a means for splitting said audio-visual signal into blocks, a means for calculating the DC value of said blocks, a means for assigning said blocks to regions with similar DC values, a means for calculating DC-differences between said regions, and a means for generating said signature bits whereby the signature bits are based on said DC differences.
 22. A computer readable medium having a plurality of computer-executable instructions for performing the method according to claim 1 comprising a program module for formation of a progressive signature giving instructions to a computer for generating a variable number of signature bits.
 23. A computer readable medium according to claim 22 further having a plurality of computer-executable instructions comprising a first program module for splitting an audio-visual signal into blocks, a second program module for calculating the DC value of said blocks, a third program module for assigning said blocks to regions with similar DC values, a fourth program module for calculating DC-differences between said regions, and a fifth program module for generating said signature bits, said signature bits being based on said DC differences.
 24. Use of the method according to claim 1 in a surveillance camera or security camera or digital image camera or digital video camera or a medical imaging system.